Building and Applying Profiles Through Term Extraction
Authors
Abstract
This paper proposes a technique for building entity profiles from a set of defining corpora, i.e., one corpus per entity that is taken as that entity's definition. The technique is applied in a classification task to determine how closely a text, or corpus, is related to each of the profiled entities. Although the technique is general enough to be applied to any kind of entity, the experiments in this paper are conducted on entities describing a set of professors of a computer science graduate school, each defined by the M.Sc. theses and Ph.D. dissertations they advised. The profile of each entity is then used to categorize other texts into one of the built profiles. The analysis of the obtained results illustrates the power of the proposed technique.
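The abstract does not say how terms are extracted or how relatedness is scored, so the following is only a minimal sketch of the general idea under stated assumptions: terms are plain lowercase word tokens, a profile is a relative term-frequency table, and relatedness is cosine similarity. The function names and the toy corpora are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def extract_terms(text):
    """Return lowercase word tokens of 3+ letters.

    Assumption: a crude stand-in for a real term extraction step.
    """
    return re.findall(r"[a-z]{3,}", text.lower())


def build_profile(corpus):
    """Build a profile (term -> relative frequency) from a defining corpus."""
    counts = Counter(term for doc in corpus for term in extract_terms(doc))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_entities(text, profiles):
    """Score a text against every profile, most related entity first."""
    query = build_profile([text])
    scores = {name: cosine(query, prof) for name, prof in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical defining corpora: one small corpus per entity.
profiles = {
    "prof_a": build_profile([
        "parallel computing on gpu clusters",
        "scheduling for parallel systems",
    ]),
    "prof_b": build_profile([
        "term extraction from text corpora",
        "natural language text classification",
    ]),
}

ranking = rank_entities("classification of text using extracted terms", profiles)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Here the query text shares the terms "text" and "classification" with the second toy corpus and nothing with the first, so the second entity is ranked first. A real system would likely replace the tokenizer with a proper term extractor and weight terms by how specific they are to each entity.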
Similar resources
Screening of the profiles of the essential oils from the aerial parts of Nepeta racemosa using classical and microwave-based methods: Comparison with the volatiles using headspace solid-phase micro-extraction
Background & Aim: Nepeta racemosa is an herbal and medicinal plant, and this report aims to identify the chemical compositions of the essential oils and volatiles of its aerial parts through classical and advanced methods. Experimental: Chemical profiles of the essential oils and volatile compounds from the aerial parts of Nepeta racemosa obtai...
Protein Remote Homology Detection Based on Binary Profiles
Remote homology detection is a key element of protein structure and function analysis in computational and experimental biology. This paper presents a simple representation of protein sequences, which uses the evolutionary information of profiles for efficient remote homology detection. The frequency profiles are directly calculated from the multiple sequence alignments outputted by PSI-BLAST a...
Primary Data Encoding of a Bilingual Corpus
This paper discusses the building of a bilingual corpus of legal and administrative texts, focusing on the encoding of documentation and structural information according to the Corpus Encoding Standard. The corpus is one module in an ongoing research project about (semi-)automatic terminology acquisition at the European Academy Bolzano and will serve as a basis for applying term extraction prog...
Employing Information Extraction for Building Mobile Applications
We describe an SMS-based information system called CATS, which allows posting and searching through free Arabic text using Information Extraction (IE) technology. We discuss the challenges of applying IE technology to unedited real Arabic text. In addition, we describe the structure of this system and our approach to produce an open robust system capable of including more sub domains with the m...